Abstract—The theory of belief functions is widely used for data 
from multiple sources. Different evidence combination rules have 
been proposed in this framework according to the properties 
of the sources to combine. However, most of these combination 
rules are not efficient when there are a large number of sources. 
This is due to either the complexity or the existence of an 
absorbing element such as the total conflict mass function for the 
conjunctive based rules when applied on unreliable evidence. In 
this paper, based on the assumption that the majority of sources 
are reliable, a combination rule for a large number of sources 
is proposed using a simple idea: the more common ideas the 
sources share, the more reliable these sources are supposed to 
be. This rule is adaptable for aggregating a large number of 
sources which may not all be reliable. It will keep the spirit of 
the conjunctive rule to reinforce the belief on the focal elements 
with which the sources are in agreement. The mass on the empty 
set will be kept as an indicator of the conflict. 

The proposed rule, called LNS-CR (Conjunctive combination 
Rule for a Large Number of Sources), is evaluated on synthetic 
mass functions. The experimental results verify that the rule can 
be effectively used to combine a large number of mass functions 
and to elicit the major opinion. 


Index Terms—Theory of belief functions, big data, combination,
large number of sources, reliability


I. INTRODUCTION 


In recent years, Dempster-Shafer Theory (DST), also called
the theory of belief functions, has gained increasing attention
in the scientific community as it makes it possible to deal
with imprecise and uncertain information. It has been
applied in various domains, such as data classification [2, 3],
data clustering [4, 5], and social network analysis [6]. In
complex environments, multiple stake-holders attempt to reach
a decision by combining several sources of information and
aggregating their points of view by stressing common agreement.
The theory of belief functions, which provides many rules
to combine information represented by mass functions [7], is
widely used for decision making. In real applications, there
are usually a large number of sources. Most of the existing 
combination rules are not applicable in this case, and cannot 
be used to find the major opinion from many participants. 

One of the most famous combination rules in the belief function
framework is Dempster’s rule [7]. Smets [8] proposed
a modification of Dempster’s rule, often called the “conjunctive
rule”, where the empty set can be assigned a non-null
mass under the Transferable Belief Model (TBM) [9]. In fact,
the conjunctive rule is equivalent to Dempster’s rule without


This paper is an extension and revision of [1]. 


the normalization process. It converges quickly and clearly
towards a solution. But this rule relies on the strong assumption
that all the sources are reliable. In real applications, this
assumption is difficult to satisfy or verify. Moreover, the more
sources there are, the greater the chance that some of the
evidence is unreliable.

Smets [8] reasoned that the mass on the empty set can 
play the role of alarm. When the global conflict (the mass 
assigned to the empty set) is high, it indicates that there is 
strong disagreement among the sources of mass functions to 
combine. However, as observed in [10, 11, 12], the mass on the 
empty set is not sufficient to exactly describe the conflict since 
it includes an amount of auto-conflict [13]. Sometimes when
there is only a small amount of concordant evidence, the total
conflict mass function, i.e. $m(\emptyset) = 1$, will be an absorbing
element. Consequently, when combining a large number of
(incompatible) mass functions using the conjunctive rule, the
global conflict may tend to 1. This makes it impossible to
reveal the cause of high global conflict. We do not know
whether it is due to the sources to fuse or caused by the
absorption power of the empty set [10, 14]. In other words,
even if the mass function combined by the conjunctive rule
satisfies $m(\emptyset) \approx 1$, the proposition that the sources are
highly conflicting may be incorrect.

In order to rectify the drawbacks of the classical Dempster’s
rule and Smets’ conjunctive rule, many approaches have been
proposed that modify the combination rule. Some
authors tried to find alternative repartitions of the conflict. A 
plethora of combination rules have been brought forward in 
this way. For example, Yager [15] and Dubois and Prade [16] 
suggested assigning the highly conflicting mass to the whole 
set or a particular set. The Proportional Conflict Redistribution 
(PCR) rule, which can distribute the partial conflicts among 
the involved focal elements rather than to their union, is 
developed in [13, 17]. Apart from these approaches working 
directly on the combination rule, some studies manage the 
conflict through evidence discounting, where the reliability 
of sources is automatically and adaptively taken into account 
[10, 16, 18, 19]. 

Most of the existing combination rules are not efficient when 
applied on a large number of sources due to the ineffective way 
to handle conflict or the high complexity of the computation. 
Orponen [20] proved that the complexity of the conjunctive 
rule is NP-hard, but the complexity depends on the way to 
program the belief functions [21]. Some rules can manage 
efficiently the conflict but have large complexity [13, 16, 22, 
23], making them infeasible when applied to combine a large 


number of mass functions. 

In this paper, a conjunctive-based combination rule, named 
LNS-CR (Large Number of Sources), is proposed to aggregate 
a large number of mass functions. Our perspective on belief 
function combination is that combining mass functions from 
different sources is similar to combining opinions from multi- 
ple stake-holders in group decision-making [24], i.e. the more 
one’s opinion is consistent with the other experts, the more 
reliable the source is. We assume that all the mass functions 
available are separable mass functions, which means they can 
be expressed by a group of simple support mass functions. In 
many applications, the mass assignments are directly in the 
form of Simple Support Functions (SSF) [25]. The advantage 
of SSFs is that we can group the mass functions in such a 
way that sources in the same group share the same viewpoint. 
Mass functions in each small group are first fused and then 
discounted according to the proportions. After that, the number
of mass functions participating in the next global combination
process is independent of the number of sources, and
depends only on the number of classes. As a result, the problem
brought by the absorbing element (the empty set) using the 
conjunctive rule can be avoided. Moreover, an approximation 
method when the number of mass functions is large enough is 
presented. The main contributions of this paper are as follows: 

• A new conjunctive-based combination rule, named the
LNS-CR rule, is put forward. The property of reinforcing
the belief on the focal elements with which most
of the sources agree is preserved in the proposed rule;

• The assumption of the LNS-CR rule on the reliability of
the sources is more relaxed: it does not require that all the
sources are reliable, only that at least half of them are;

• LNS-CR can be used to combine mass functions from a
large number of sources, and in particular to elicit
the major opinion;

• It is shown that the LNS-CR rule has acceptable
complexity.


The rest of this paper is organized as follows. In Section 
2, some basic knowledge of belief function theory is briefly 
introduced. The proposed evidence combination approach is 
presented in detail in Section 3. Numerical examples are 
employed to compare different combination rules and show 
the effectiveness of LNS-CR rule in Section 4. Finally, Section 
5 concludes the paper. 


II. BACKGROUND 
A. Basic knowledge of belief function theory 
Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ be the discernment frame. A mass
function is defined on the power set $2^\Theta = \{A : A \subseteq \Theta\}$. The
mass function $m : 2^\Theta \to [0,1]$ is said to be a Basic Belief
Assignment (bba) on $2^\Theta$ if it satisfies:

$$\sum_{A \subseteq \Theta} m(A) = 1. \qquad (1)$$

Every $A \in 2^\Theta$ such that $m(A) > 0$ is called a focal element,
and the set of focal elements is denoted by $\mathcal{F}$. In a practical
way of programming, the elements of $2^\Theta$ can be arranged by
natural order [26]:
$\theta_1, \theta_2, \{\theta_1, \theta_2\}, \theta_3, \ldots, \{\theta_1, \theta_2, \theta_3\}, \theta_4, \ldots, \Theta$.

The frame of discernment can also be a focal element.
If $\Theta$ is a focal element, the mass function is called non-dogmatic.
The mass assigned to the frame of discernment,
$m(\Theta)$, is interpreted as a degree of ignorance. In the case
of total ignorance, $m(\Theta) = 1$. This type of mass assignment
is vacuous. If there is only one focal element, i.e.
$m(A) = 1$, $A \subseteq \Theta$, the mass function is categorical. Another
special case of assignment is named consonant mass functions,
where the focal elements include each other as a subset, i.e.
if $A, B \in \mathcal{F}$, then $A \subseteq B$ or $B \subseteq A$.

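The credibility and plausibility functions are derived from
a bba $m$ as in Eqs. (2) and (3):

$$Bel(A) = \sum_{B \subseteq A,\, B \neq \emptyset} m(B), \quad \forall A \subseteq \Theta, \qquad (2)$$

$$Pl(A) = \sum_{B \cap A \neq \emptyset} m(B), \quad \forall A \subseteq \Theta. \qquad (3)$$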


Each quantity $Bel(A)$ measures the minimal belief on $A$
justified by available information on $B$ ($B \subseteq A$), while
$Pl(A)$ is the maximal belief on $A$ justified by information
on $B$ which is not contradictory with $A$ ($A \cap B \neq \emptyset$). The
commonality function $q$ and the implicability function $b$ are
defined respectively as

$$q(A) = \sum_{B \supseteq A} m(B), \quad \forall A \subseteq \Theta \qquad (4)$$

and

$$b(A) = Bel(A) + m(\emptyset), \quad \forall A \subseteq \Theta. \qquad (5)$$


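A bba $m$ can be recovered from any of these functions. For
instance,

$$m(A) = \sum_{B \supseteq A} (-1)^{|B| - |A|} q(B), \quad \forall A \subseteq \Theta \qquad (6)$$

and

$$m(A) = \sum_{B \subseteq A} (-1)^{|A| - |B|} b(B), \quad \forall A \subseteq \Theta. \qquad (7)$$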


Belief functions can be transformed into a probability
function by Smets’ method [27], where each mass of belief
$m(A)$ is equally distributed among the elements of $A$. This
leads to the concept of pignistic probability, $BetP$. For all
$\theta_i \in \Theta$, we have

$$BetP(\theta_i) = \sum_{A \subseteq \Theta \,|\, \theta_i \in A} \frac{m(A)}{|A| \, (1 - m(\emptyset))}, \qquad (8)$$

where $|A|$ is the cardinality of set $A$ (number of elements of
$\Theta$ in $A$). Pignistic probabilities can help make a decision.
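The definitions in Eqs. (2), (3), (4) and (8) can be illustrated with a minimal Python sketch. The encoding is an assumption made here for illustration, not something prescribed by the paper: a subset of the frame is an integer bitmask (bit i set when the (i+1)-th element belongs to the subset), and a bba is a dictionary from bitmask to mass. All function names are hypothetical.

```python
from typing import Dict, List

# Assumed encoding: subset of Theta = int bitmask, bba = {bitmask: mass}.
Mass = Dict[int, float]


def bel(m: Mass, a: int) -> float:
    """Eq. (2): sum of m(B) over non-empty B included in A."""
    return sum(v for b, v in m.items() if b != 0 and (b & a) == b)


def pl(m: Mass, a: int) -> float:
    """Eq. (3): sum of m(B) over B intersecting A."""
    return sum(v for b, v in m.items() if (b & a) != 0)


def commonality(m: Mass, a: int) -> float:
    """Eq. (4): sum of m(B) over B containing A."""
    return sum(v for b, v in m.items() if (b & a) == a)


def betp(m: Mass, n: int) -> List[float]:
    """Eq. (8): pignistic probability of each of the n singletons.

    Undefined (division by zero) when m(emptyset) = 1.
    """
    norm = 1.0 - m.get(0, 0.0)
    probs = [0.0] * n
    for b, v in m.items():
        if b == 0:
            continue
        card = bin(b).count("1")  # |A|
        for i in range(n):
            if (b >> i) & 1:
                probs[i] += v / (card * norm)
    return probs


# Example on a three-element frame: masses on {t1}, {t1, t2} and Theta.
m_example = {0b001: 0.5, 0b011: 0.3, 0b111: 0.2}
```

With this example, the belief of {t1, t2} collects the masses of {t1} and {t1, t2}, while its commonality collects those of {t1, t2} and the whole frame.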


B. Consistency of mass assignments 


The consistency between two bbas can be defined in two
different ways. Suppose the sets of focal elements for $m_1$ and
$m_2$ are $\mathcal{F}_1$ and $\mathcal{F}_2$ respectively. Mass functions $m_1$ and $m_2$
are called strongly consistent if and only if

$$\bigcap_{E \in \{\mathcal{F}_1 \cup \mathcal{F}_2\}} E \neq \emptyset. \qquad (9)$$

Meanwhile, bbas $m_1$ and $m_2$ are called weakly consistent if and
only if

$$\forall A \in \mathcal{F}_1,\ B \in \mathcal{F}_2, \quad A \cap B \neq \emptyset. \qquad (10)$$

Strongly consistent evidence means that there is at least one
element that is common to all subsets [28]. It is easy to see
that, when $m_1$ and $m_2$ are strongly consistent, they are sure to be
weakly consistent. This is the definition of consistency between
belief functions. The inconsistency within an individual mass
assignment can be defined similarly [12].
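As a small illustration of Eqs. (9) and (10), the two notions of consistency can be checked directly on the sets of focal elements. The sketch below assumes, for illustration only, that focal elements are encoded as integer bitmasks; the function names are hypothetical.

```python
from functools import reduce
from typing import Sequence


def strongly_consistent(f1: Sequence[int], f2: Sequence[int]) -> bool:
    """Eq. (9): the intersection of all focal elements of both bbas is non-empty."""
    return reduce(lambda x, y: x & y, list(f1) + list(f2)) != 0


def weakly_consistent(f1: Sequence[int], f2: Sequence[int]) -> bool:
    """Eq. (10): every focal element of m1 intersects every focal element of m2."""
    return all((a & b) != 0 for a in f1 for b in f2)


# Focal sets on a three-element frame (bit i <=> element i+1).
# {t1, t2} meets both {t2, t3} and {t1, t3}, but no element is shared by all three.
f_a, f_b = [0b011], [0b110, 0b101]
```

Here `f_a` and `f_b` are weakly but not strongly consistent, which shows that the converse of the implication stated above does not hold.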


C. Reliability-based discounting 


When the sources of evidence are not completely reliable,
the discounting operation proposed by Shafer [25] and justified
by Smets [29] can be applied. Denote the reliability degree
of mass function $m$ by $\alpha \in [0,1]$; then the discounting
operation can be defined as:

$$m^{\alpha}(A) = \begin{cases} \alpha \times m(A) & \forall A \subset \Theta, \\ 1 - \alpha + \alpha \times m(\Theta) & \text{if } A = \Theta. \end{cases} \qquad (11)$$

If $\alpha = 1$, the evidence is completely reliable and the bba will
remain unchanged. On the contrary, if $\alpha = 0$, the evidence
is completely unreliable. In this case the so-called vacuous
belief function, $m(\Theta) = 1$, is obtained. It describes total
ignorance.
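The discounting operation of Eq. (11) is short enough to sketch in a few lines of Python. The dictionary-of-bitmasks encoding and the function name `discount` are illustrative assumptions, not part of the original text.

```python
from typing import Dict


def discount(m: Dict[int, float], alpha: float, theta: int) -> Dict[int, float]:
    """Eq. (11): scale every mass by alpha and move the remainder to Theta.

    `theta` is the bitmask of the whole frame (assumed encoding).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    out = {a: alpha * v for a, v in m.items() if a != theta}
    out[theta] = 1.0 - alpha + alpha * m.get(theta, 0.0)
    return out


# Two-element frame: 0.7 on {t1}, 0.3 on Theta, source judged half reliable.
m_half = discount({0b01: 0.7, 0b11: 0.3}, alpha=0.5, theta=0b11)
```

With `alpha=0.5` the mass on {t1} drops to 0.35 and the mass on the frame rises to 0.65; `alpha=1` leaves the bba unchanged and `alpha=0` yields the vacuous bba.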

Before invoking the discounting process, the reliability of
each source should be known. One possible way to estimate
the reliability is to use confusion matrices [30]. Generally, 
the goal of discounting is to reduce global conflict before 
combination. One can assume that the conflict comes from 
the unreliability of the sources. Therefore, the source reliability 
estimation is to some extent linked to the estimation of conflict 
between sources. 

Hence, Martin et al. [10] proposed to use a conflict measure
to evaluate the relative reliability of experts. Once the degree
of conflict is computed, the relative reliability of the source
can be computed accordingly. Suppose there are $S$ sources,
$\mathcal{S} = \{s_1, s_2, \ldots, s_S\}$; the reliability discounting factor $\alpha_j$ of
source $s_j$ can be defined as follows:

$$\alpha_j = f(\mathrm{Conf}(s_j, \mathcal{S})), \qquad (12)$$

where $\mathrm{Conf}(s_j, \mathcal{S})$ quantifies the degree to which source $s_j$
conflicts with the other sources in $\mathcal{S}$, and $f$ is a decreasing
function. The following function is suggested by the authors:

$$\alpha_j = \left(1 - \mathrm{Conf}(s_j, \mathcal{S})^{\lambda}\right)^{1/\lambda}, \qquad (13)$$

where $\lambda > 0$.
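To make the behaviour of the decreasing function in Eq. (13) concrete, here is a minimal Python sketch. It assumes that the conflict measure has already been computed and lies in [0, 1]; the function and parameter names are hypothetical.

```python
def reliability_from_conflict(conf: float, lam: float = 1.0) -> float:
    """Eq. (13): alpha = (1 - conf**lam) ** (1 / lam), with lam > 0.

    `conf` is an already-computed conflict degree, assumed to lie in [0, 1].
    """
    if not 0.0 <= conf <= 1.0:
        raise ValueError("conf must lie in [0, 1]")
    if lam <= 0.0:
        raise ValueError("lam must be strictly positive")
    return (1.0 - conf ** lam) ** (1.0 / lam)
```

A source with no conflict keeps full reliability, a totally conflicting source gets reliability 0, and with `lam=1` the factor is simply one minus the conflict.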

In [31], the authors considered two possible
conflict origins, an extrinsic measure and an intrinsic measure, to
estimate reliability. In their opinion, conflict may not only
come from the source’s contradiction (extrinsic measure), but
also from the confusion rate of a source (intrinsic measure).
The reliability discounting factor, called Generic Discounting
Factor (GDF), is then suggested to be a weighted sum of the
two items:

$$\alpha_i = \frac{k\delta + l\beta}{k + l}, \qquad (14)$$

where $k > 0$, $l > 0$ are the weight factors. In the above
equation, $\delta$ denotes the internal conflict measure of the
treated source indicating its confusion rate, while $\beta$ is the
average distance between the treated source $s_i$ and $s_j$, where
$j \in \mathcal{S}$, $j \neq i$. Different intrinsic and extrinsic conflict measures
can be adopted here.

There are some other methods to estimate the reliability. 
In [32], the authors proposed to estimate the reliability of 
sources based on a degree of falsity. The bbas are sequentially 
and incrementally discounted until the mass assigned to the 
empty set is smaller than a given threshold k. After that 
the discounted mass functions can be combined using the 
conjunctive rule since there is little global conflict at this time. 
In [33], the source reliability is obtained by minimizing the 
distance between the pignistic probabilities computed from the 
discounted beliefs and the actual value of the data. In Samet 
et al. [34], the authors proposed two different versions of 
generic discounting approaches: weighted GDA and exponent 
GDA. A new degree of disagreement is proposed by Yang et al. 
[35], where the reliability discounting factor can be generated. 
Klein and Colot [36] viewed the degree of conflict as a 
function of discounting rates and introduced a new criterion 
assessing bbas’ reliability. These reliability estimation methods 
either consider the distance (or dissimilarity) between each 
pair of bbas, or the mass assigned to the empty set after the 
conjunctive combination. However, these methods are of high 
complexity and not suitable for large data applications. 


D. Simple support function 


Suppose $m$ is a bba defined on the frame of discernment $\Theta$.
If there exists a subset $A \subseteq \Theta$ such that $m$ can be expressed
in the following form:

$$m(X) = \begin{cases} w & X = \Theta, \\ 1 - w & X = A, \\ 0 & \text{otherwise}, \end{cases} \qquad (15)$$

where $w \in [0,1]$, then the belief function related to bba $m$
is called a Simple Support Function (SSF) (also called simple
mass function) [25] focused on $A$. Such a SSF can be denoted
by $A^{w}(\cdot)$, where the exponent $w$ of the focal element $A$ is the
basic belief mass (bbm) given to the frame of discernment
$\Theta$, $m(\Theta)$. The complement of $w$ to 1, i.e. $1 - w$, is the bbm
allocated to $A$ [37]. If $w = 1$ the mass function represents
total ignorance; if $w = 0$ the mass function is a categorical
bba on $A$.

A belief function is separable if it is a SSF or if it is the
conjunctive combination of some SSFs [38]. In the work of
[38], this kind of separable mass is called u-separable, where
“u” stands for “unnormalized”, indicating that the conjunctive rule
is the unnormalized version of the Dempster-Shafer rule. The set
of separable mass functions is not obvious to obtain. It is easy
to see that consonant mass functions (whose focal elements are nested)
are separable [39]. Smets [37] defined the Generalized Simple
Support Function (GSSF) by relaxing the weight $w$ to $[0, \infty)$.
Those GSSFs with $w \in (1, \infty)$ are called Inverse Simple
Support Functions (ISSF). Smets proved that all non-dogmatic
mass functions are separable if one uses GSSFs. For any


non-dogmatic belief function $m_0$, the canonical decomposition
method proposed by Smets is as follows. First, calculate the
commonality number for all focal elements, which is given by

$$Q_0(X) = \sum_{B \supseteq X} m_0(B). \qquad (16)$$

Secondly, for any $A \subset \Theta$, calculate the value $w_A$ as follows:

$$w_A = \prod_{X \supseteq A} Q_0(X)^{(-1)^{|X| - |A| + 1}}. \qquad (17)$$

Then the belief function $m_0$ can be represented by the conjunctive
combination of all the functions $A^{w_A}$, i.e.

$$m_0 = \mathop{\bigcirc\mkern-14mu\cap}\limits_{A \subset \Theta} A^{w_A}, \qquad (18)$$

where $\bigcirc\mkern-14mu\cap$ denotes the conjunctive combination rule. For fast
computation, the Fast Möbius Transform (FMT) method [40]
can be invoked.
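The canonical decomposition of Eqs. (16) and (17) can be sketched by brute force over all subsets, which is enough to check small examples (it does not use the Fast Möbius Transform). The bitmask encoding and the function name are assumptions made for illustration.

```python
from typing import Dict


def canonical_weights(m: Dict[int, float], n: int) -> Dict[int, float]:
    """Weights w_A of Eq. (17) for every A strictly included in Theta.

    `m` maps subset bitmasks to masses on a frame of n elements and must be
    non-dogmatic (positive mass on Theta), so that every commonality is > 0.
    """
    full = (1 << n) - 1
    if m.get(full, 0.0) <= 0.0:
        raise ValueError("the mass function must be non-dogmatic")
    # Eq. (16): commonality of every subset X
    q = [sum(v for b, v in m.items() if (b & x) == x) for x in range(full + 1)]
    weights = {}
    for a in range(full):  # every A except Theta itself
        w = 1.0
        for x in range(full + 1):
            if (x & a) == a:  # X is a superset of A
                diff = bin(x).count("1") - bin(a).count("1")
                exponent = -1 if (diff + 1) % 2 else 1  # (-1)^(|X|-|A|+1)
                w *= q[x] ** exponent
        weights[a] = w
    return weights


# Conjunctive combination of the SSFs {t1}^0.4 and {t2}^0.5 on a two-element frame.
m_sep = {0b00: 0.3, 0b01: 0.3, 0b10: 0.2, 0b11: 0.2}
w_sep = canonical_weights(m_sep, 2)
```

On this example the decomposition returns the two original weights, 0.4 for {t1} and 0.5 for {t2}, and a weight of 1 (a vacuous factor) for the empty set.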


E. Some combination rules 


How to combine efficiently several bbas coming from 
distinct sources is a major information fusion problem in the 
belief function framework. Many rules have been proposed for 
such a task. Here we just briefly recall how some most popular 
rules are mathematically defined. 

When information sources are reliable, the fusion
operators used can be based on the conjunctive combination. If bbas
$m_j$, $j = 1, 2, \ldots, S$, describe $S$ distinct items of evidence on
$\Theta$, the result of the conjunctive rule [9] is defined
as

$$m_{conj}(X) = \Big(\mathop{\bigcirc\mkern-14mu\cap}\limits_{j=1,\ldots,S} m_j\Big)(X) = \sum_{Y_1 \cap \cdots \cap Y_S = X} \prod_{j=1}^{S} m_j(Y_j), \qquad (19)$$

where $m_j(Y_j)$ is the mass allocated to $Y_j$ by expert $j$. To apply
this rule, the sources are assumed reliable and cognitively
independent.

Another kind of conjunctive combination is Dempster’s
rule [41]. Assuming that $m_{conj}(\emptyset) \neq 1$, the result of the
combination by Dempster’s rule is

$$m_{Dempster}(X) = \begin{cases} 0 & \text{if } X = \emptyset, \\ \dfrac{m_{conj}(X)}{1 - m_{conj}(\emptyset)} & \text{otherwise}. \end{cases} \qquad (20)$$


The item

$$\kappa \triangleq m_{conj}(\emptyset) = \sum_{Y_1 \cap \cdots \cap Y_S = \emptyset} \prod_{j=1}^{S} m_j(Y_j)$$

is generally called Dempster’s degree of conflict of the combination
or the inconsistency of the combination. As the conjunctive
rule is not idempotent, $m_{conj}(\emptyset)$ includes an amount
of auto-conflict [42], and it is called global conflict to mark
the difference.
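The conjunctive rule of Eq. (19), Dempster's normalization of Eq. (20) and the growth of the global conflict can be illustrated with a short Python sketch. It uses a naive pairwise product over focal elements rather than an optimized transform, and the bitmask encoding and function names are assumptions made for illustration.

```python
from functools import reduce
from typing import Dict, Sequence

Mass = Dict[int, float]  # assumed encoding: subset bitmask -> mass


def conjunctive(m1: Mass, m2: Mass) -> Mass:
    """Eq. (19) for two sources: products of masses go to the intersection."""
    out: Mass = {}
    for a, va in m1.items():
        for b, vb in m2.items():
            out[a & b] = out.get(a & b, 0.0) + va * vb
    return out


def dempster(masses: Sequence[Mass]) -> Mass:
    """Eq. (20): conjunctive combination followed by normalization."""
    conj = reduce(conjunctive, masses)
    kappa = conj.get(0, 0.0)  # global conflict m_conj(emptyset)
    if kappa >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {a: v / (1.0 - kappa) for a, v in conj.items() if a != 0}


# Two partially conflicting sources on a two-element frame.
m_1 = {0b01: 0.8, 0b11: 0.2}
m_2 = {0b10: 0.6, 0b11: 0.4}
pair = conjunctive(m_1, m_2)
# Ten copies of each source: the mass on the empty set approaches 1.
many = reduce(conjunctive, [m_1, m_2] * 10)
```

For the single pair the global conflict is 0.48; after combining ten copies of each source it exceeds 0.999, which is the absorbing behaviour of the empty set discussed in the introduction.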

The conjunctive rule can be applied only if all the experts
are reliable. Otherwise, the disjunctive rule [43], which
only assumes that at least one of the sources is reliable, can be
used. The disjunctive combination of $S$ sources can be defined
as

$$m_{disj}(X) = \sum_{Y_1 \cup \cdots \cup Y_S = X} \prod_{j=1}^{S} m_j(Y_j). \qquad (21)$$

The conjunctive and disjunctive rules can be conveniently
expressed by means of the commonality function $q$ (Eq. (4))
and the implicability function $b$ (Eq. (5)) [43]. Let $q_i$ and $b_i$ be
the commonality function and implicability function respectively
(associated with $m_i$); then the commonality function of
the conjunctive combination of $S$ bbas is

$$q_{conj}(A) = \prod_{i=1}^{S} q_i(A), \quad \forall A \subseteq \Theta, \qquad (22)$$

while the implicability function of the disjunctive combination
of $S$ bbas is

$$b_{disj}(A) = \prod_{i=1}^{S} b_i(A), \quad \forall A \subseteq \Theta. \qquad (23)$$


Since functions $m$, $q$ and $b$ (as well as $bel$ and $pl$) are
equivalent representations, the mass function $m$ can be recovered
from the functions $q$ and $b$ using the Fast Möbius Transform (FMT)
method. The conversion can be done in time
proportional to $n2^n$ [44]*. For the conjunctive combination of
$S$ sources, the $S$ bbas should be converted into commonality
functions first. After calculating the product of the $S$ commonality
functions, another transformation, from $q$ back to $m$, should be
invoked. Overall the total complexity is $O(Sn2^n + S2^n + n2^n)$,
and the time needed is proportional to $Sn2^n$ [44, 45].
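The commonality-based route of Eq. (22) can be sketched as follows: transform each bba to its commonality function, multiply pointwise, and transform back. This is a generic superset-sum (zeta) transform and its inverse over bitmask-indexed vectors, written here as an illustration; it is not taken from the ibelief package, and the function names are hypothetical. Vectors are indexed by subset bitmask, so index 0 is the empty set and the last index is the whole frame.

```python
from typing import List, Sequence


def m_to_q(m_vec: Sequence[float]) -> List[float]:
    """Mass vector (length 2**n) -> commonality vector, in O(n 2^n)."""
    q = list(m_vec)
    n = len(q).bit_length() - 1
    for i in range(n):
        bit = 1 << i
        for x in range(len(q)):
            if not x & bit:
                q[x] += q[x | bit]  # add the supersets that contain element i
    return q


def q_to_m(q_vec: Sequence[float]) -> List[float]:
    """Inverse transform: commonality vector -> mass vector."""
    m = list(q_vec)
    n = len(m).bit_length() - 1
    for i in range(n):
        bit = 1 << i
        for x in range(len(m)):
            if not x & bit:
                m[x] -= m[x | bit]
    return m


def conjunctive_fmt(mass_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Eq. (22): product of commonalities, then back to masses."""
    q_prod = [1.0] * len(mass_vectors[0])
    for m_vec in mass_vectors:
        q_prod = [a * b for a, b in zip(q_prod, m_to_q(m_vec))]
    return q_to_m(q_prod)


# Two sources on a two-element frame; index order: empty, {t1}, {t2}, Theta.
fused = conjunctive_fmt([[0.0, 0.8, 0.0, 0.2], [0.0, 0.0, 0.6, 0.4]])
```

The three stages correspond to the three terms of the complexity expression above: S forward transforms, S pointwise products and one inverse transform.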

The conflict could be redistributed on partial ignorance like
in the Dubois and Prade rule (DP rule) [16], which can be
seen as a mixed conjunctive and disjunctive rule. For all
$X \subseteq \Theta$, $X \neq \emptyset$:

$$m_{DP}(X) = \sum_{Y_1 \cap \cdots \cap Y_S = X} \prod_{j=1}^{S} m_j(Y_j) + \sum_{\substack{Y_1 \cup \cdots \cup Y_S = X \\ Y_1 \cap \cdots \cap Y_S = \emptyset}} \prod_{j=1}^{S} m_j(Y_j), \qquad (24)$$

where $m_j$ is the mass function delivered by expert $j$. In the
general case, this rule cannot be programmed with the Fast
Möbius Transform method because all the partial conflicts must
be considered. If the implementation is made like that in
Ref. [46], it takes much more time than the conjunctive rule.

Denœux [38] proposed a family of conjunctive and disjunctive
rules using triangular norms. The cautious rule [47, 48]
belongs to that family and can be used to combine mass
functions for which the independence assumption is not verified.
The cautious combination of $S$ non-dogmatic mass functions

*This is based on the assumption that the mass functions are arranged in
natural order. If not, the complexity is proportional to $n^2 2^n$. The complexity
analysis in this work assumes throughout that the bbas to be combined are encoded
using the natural order.


$m_j$, $j = 1, 2, \ldots, S$, is defined by the bba with the following
weight function:

$$w(A) = \bigwedge_{j=1}^{S} w_j(A), \quad A \in 2^\Theta \setminus \{\Theta\}. \qquad (25)$$

We thus have

$$m_{Cautious}(X) = \mathop{\bigcirc\mkern-14mu\cap}\limits_{A \subsetneq \Theta} A^{\bigwedge_{j=1}^{S} w_j(A)}, \qquad (26)$$

where $A^{w_j(A)}$ is the simple support function focused on $A$
with weight function $w_j(A)$ issued from the canonical decomposition
of $m_j$. Note also that $\wedge$ is the min operator. The time
consumption of the cautious rule includes the canonical decomposition
of non-dogmatic mass functions and is therefore
higher than that of the conjunctive rule. If this rule is implemented
with the Fast Möbius Transform method, the complexity is proportional
to $Sn2^n$.

Murphy [49] presented the average combination rule and
proposed to use the mean of the basic belief assignments
as the fusion of evidence. Therefore, for each focal element
$X \in 2^\Theta$ of $S$ mass functions, the combined one is defined as
follows:

$$m_{ave}(X) = \frac{1}{S} \sum_{j=1}^{S} m_j(X), \quad \forall X \subseteq \Theta. \qquad (27)$$

The complexity of the average rule is proportional to $S2^n$.

A family of fusion rules based on new Proportional Conflict
Redistributions (PCR) for the combination of uncertain
and conflicting information has been developed in the Dezert-
Smarandache Theory (DSmT) framework [50]. Among them,
the fusion rule called PCR6, proposed by Martin and Osswald
[13], is one of the most popular. For
the combination of $S > 2$ sources, the fused mass is given by
$m_{PCR6}(\emptyset) = 0$, and for $X \neq \emptyset$ in $2^\Theta$

$$m_{PCR6}(X) = m_{conj}(X) + \sum_{i=1}^{S} m_i(X)^2 \sum_{\substack{\bigcap_{k=1}^{S-1} Y_{\sigma_i(k)} \cap X = \emptyset \\ (Y_{\sigma_i(1)}, \ldots, Y_{\sigma_i(S-1)}) \in (2^\Theta)^{S-1}}} \left( \frac{\prod_{j=1}^{S-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)})}{m_i(X) + \sum_{j=1}^{S-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)})} \right), \qquad (28)$$

where $\sigma_i$ counts from 1 to $S$ avoiding $i$:

$$\begin{cases} \sigma_i(j) = j & \text{if } j < i, \\ \sigma_i(j) = j + 1 & \text{if } j \geq i. \end{cases} \qquad (29)$$

As $Y_i$ is a focal element of expert/source $i$, we have $m_i(Y_i) > 0$.
Then

$$m_i(X) + \sum_{j=1}^{S-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)}) \neq 0.$$

In Eq. (28), $m_{conj}$ is the conjunctive rule given by Eq. (19).
Here again, the Fast Möbius Transform method to program the
belief functions is not generally the best way. If the implementation
is made like that in Ref. [46], the time consumption is
very high.


III. A COMBINATION RULE FOR A LARGE NUMBER OF 
MASS FUNCTIONS 


The main idea of the conjunctive combination rule is to 
reinforce the belief on the focal elements with which most of 
the sources agree. Martin et al. [10] showed that the mass on 
the empty set, which is an absorbing element, tends quickly 
to 1 with the number of sources when combining inconsistent 
bbas. Consequently, when using Dempster’s rule (Eq. (20)), the
gap between $\kappa$ and 1 may rapidly fall below machine precision,
even if the combination is valid theoretically. In that case the
fused bba by the conjunctive rules (normalized or not) and the
pignistic probability are inefficient. Moreover, the assumption
that all the sources are reliable for the conjunctive combination 
rule is difficult to reach in real applications. The more sources 
there are, the less chance that this assumption is valid. 

The principle of the conjunctive rule with the reinforcement 
of belief and the role of the empty set as an alarm are essential 
in the theory of belief functions. In order to propose a rule 
which can be adapted to the combination of a large number of 
mass functions and keep the previous behavior, the following 
assumptions are made: 


• The majority of sources are reliable;

• The more consistent a source is with the others,
the more reliable it is;

• The sources are cognitively independent [43].


These assumptions seem reasonable if we consider combining
mass functions as a kind of group decision-making problem.
As a result, the proposed rule will give more importance
to the groups of mass functions that are in a domain, and 
it is without auto-conflict [13, 14]. In order to take into 
account this effect, this rule will discount the mass functions 
according to the number of sources giving bbas with the same 
focal elements. The discounting factor is directly given by the 
proportion of mass functions with the same focal elements. 
This procedure is for the elicitation of the majority opinion. 

The simple support mass functions are considered here. In 
this case, the mass functions can be grouped in the light of 
their focal elements (except the frame ©). To make the rule 
applicable on separable mass functions, the decomposition 
process should be performed to decompose each bba into 
simple support mass functions. In most applications, the
basic belief can be defined using separable mass functions, 
such as simple support functions [2] and consonant mass 
functions [51, 52]. 

Hereafter we describe the proposed LNS-CR rule for sim- 
ple support functions, and then an approximation calculation 
method of LNS-CR rule is suggested. 


A. LNS-CR rule for simple support functions 


Suppose that each piece of evidence is represented by a SSF. Then
all the bbas can be divided into at most $2^n$ groups (where
$n = |\Theta|$). It is easy to see that there is no conflict at all
in each group because of consistency. The focal elements of
the SSF are singletons and $\Theta$ itself. For the combination of
bbas inside each group, the conjunctive rule can be employed
directly. Then the fused bbas are discounted according to the
number of mass functions in each group. Finally, the global
combination of the bbas of different groups is performed, also
using the conjunctive rule. Suppose that all bbas are defined on
the frame of discernment $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$, and denoted
by $m_j = (A_i)^{w_j}$, $j = 1, \ldots, S$ and $i = 1, 2, \ldots, c$, where
$c \leq 2^n$. The proposed rule, called LNS-CR for Large Number
of Sources rule, is composed of the following four steps:

1) Cluster the simple bbas into $c$ groups based on their focal
element $A_i$. For convenience, each class is labeled
by its corresponding focal element.

2) Combine the bbas in the same group. Denote the combined
bba in group $A_k$ by the SSF

$$\hat{m}_k = (A_k)^{\hat{w}_k}, \quad k = 1, 2, \ldots, c.$$

Let the number of bbas in group $A_k$ be $s_k$. If the
conjunctive rule is adopted, we have

$$\hat{m}_k = \mathop{\bigcirc\mkern-14mu\cap}\limits_{j=1,\ldots,s_k} m_j = (A_k)^{\prod_{j=1}^{s_k} w_j}. \qquad (30)$$


3) Reliability-based discounting. Suppose the fused bba of
all the mass functions in $A_k$ is $\hat{m}_k$. At this time, each
group can be regarded as a source, and there are $c$
sources in total. The reliability of one source can be
estimated by comparison with a group of sources. In our
opinion, the reliability of source $A_k$ is related to the
proportion of bbas in this group. The larger the number
of bbas in group $A_k$ is, the more reliable $A_k$ is. Then
the reliability discounting factor of $\hat{m}_k$ can be defined
as:

$$\alpha_k = \frac{s_k}{\sum_{i=1}^{c} s_i}. \qquad (31)$$

In order to keep the mass function representing total
ignorance as a neutral element of the rule, in Eq. (31) we
let $\alpha_k = 0$ for the group with $A_k = \Theta$. Another version
of the discounting can be given by a factor taking into
account the precision of the group:

$$\alpha_k = \frac{\beta_k^{\eta} s_k}{\sum_{i=1}^{c} \beta_i^{\eta} s_i}, \qquad (32)$$

where

$$\beta_k = \frac{|\Theta|}{|A_k|}. \qquad (33)$$

Parameter $\eta$ can be used to adjust the precision of the
combination results. The larger the value of $\eta$ is, the
less imprecise the resulting bba is. The discounted bba
of $\hat{m}_k$ can be denoted by the SSF $\hat{m}'_k = (A_k)^{\hat{w}'_k}$ with
$\hat{w}'_k = 1 - \alpha_k + \alpha_k \hat{w}_k$. As we can see, when the number of bbas
in one group is larger, $\alpha_k$ is closer to 1. That is to say,
the fused mass in this group is more reliable.

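4) Globally combine the fused bbas of the different groups using
the conjunctive rule:

$$m_{LNS\text{-}CR} = \mathop{\bigcirc\mkern-14mu\cap}\limits_{k=1,\ldots,c} \hat{m}'_k = \mathop{\bigcirc\mkern-14mu\cap}\limits_{k=1,\ldots,c} (A_k)^{\hat{w}'_k}. \qquad (34)$$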

Remarks:

• The reliability estimation method proposed here is very
simple compared with the previously mentioned methods
in Section II-C, where usually the distance between bbas
should be calculated or a special learning process is
required. In the LNS-CR rule, to evaluate the reliability
discounting factor, we only need to count the number of
SSFs in each group. Note that other reliability estimation
methods can also be used here.

• In the last step of combination, as the number of mass
functions that take part in the global combination is
small (at most $2^n$), other combination rules such as the DP
rule and the PCR rules are also possible in practice instead
of Eq. (34).
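The four steps above can be sketched in Python for the case where every source is already a simple support function. This is an illustrative reading of Eqs. (30), (31) and (34), not the implementation from the ibelief package. Assumptions made here: a source is a pair `(focal, w)` where `focal` is a subset bitmask and `w` is the mass left on the frame; sources focused on the whole frame are vacuous and are left out of the counts so that they act as a neutral element; the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Mass = Dict[int, float]  # assumed encoding: subset bitmask -> mass


def conjunctive(m1: Mass, m2: Mass) -> Mass:
    """Pairwise conjunctive rule: products of masses go to the intersection."""
    out: Mass = {}
    for a, va in m1.items():
        for b, vb in m2.items():
            out[a & b] = out.get(a & b, 0.0) + va * vb
    return out


def lns_cr(ssfs: List[Tuple[int, float]], n: int) -> Mass:
    """LNS-CR sketch for SSFs given as (focal bitmask, weight on Theta)."""
    theta = (1 << n) - 1
    # Step 1: group the sources by focal element (vacuous sources are skipped).
    groups = defaultdict(list)
    for focal, w in ssfs:
        if focal != theta:
            groups[focal].append(w)
    total = sum(len(ws) for ws in groups.values())
    result: Mass = {theta: 1.0}  # vacuous bba, neutral for the conjunctive rule
    for focal, ws in groups.items():
        # Step 2: inner-group conjunctive combination, Eq. (30).
        w_hat = 1.0
        for w in ws:
            w_hat *= w
        # Step 3: discount by the proportion of sources in the group, Eq. (31).
        alpha = len(ws) / total
        w_disc = 1.0 - alpha + alpha * w_hat
        # Step 4: global conjunctive combination, Eq. (34).
        result = conjunctive(result, {focal: 1.0 - w_disc, theta: w_disc})
    return result


# Eight sources support {t1} and two support {t2}, each with weight 0.3 on Theta.
sources = [(0b01, 0.3)] * 8 + [(0b10, 0.3)] * 2
fused_lns = lns_cr(sources, 2)
```

In this example the majority opinion {t1} receives about 0.654 and the mass on the empty set stays near 0.146, whereas the plain conjunctive rule applied to the same ten sources would put about 0.91 on the empty set.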


B. LNSa-CR rule for the approximated combination 


If there is a large number of mass functions in each group,
an approximation method is suggested here to calculate the
combined mass in the given group. Suppose the mass functions
in the group with focal element $A_k$ ($k = 1, 2, \ldots, c$) are:

$$m_j(A) = \begin{cases} 1 - w_j & A = A_k, \\ w_j & A = \Theta, \\ 0 & \text{otherwise}, \end{cases} \quad 0 < w_j < 1,\ j = 1, 2, \ldots, s_k. \qquad (35)$$
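The combination of the masses in this group using the conjunctive
rule is

$$\hat{m}_k(A) = \begin{cases} 1 - \prod_{j=1}^{s_k} w_j & A = A_k, \\ \prod_{j=1}^{s_k} w_j & A = \Theta, \\ 0 & \text{otherwise}. \end{cases} \qquad (36)$$

It is easy to get

$$\lim_{s_k \to \infty} \hat{m}_k(A) = \begin{cases} 1 & A = A_k, \\ 0 & A = \Theta, \\ 0 & \text{otherwise}. \end{cases} \qquad (37)$$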


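This is an illustration of the conjunctive property. After the
discounting with factor $\alpha_k$, the fused bba used for the global
combination is

$$\lim_{s_k \to \infty} \hat{m}'_k(A) = \begin{cases} \alpha_k & A = A_k, \\ 1 - \alpha_k & A = \Theta, \\ 0 & \text{otherwise}. \end{cases} \qquad (38)$$

It can be represented by the SSF

$$\hat{m}'_k = (A_k)^{1 - \alpha_k}, \qquad (39)$$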

where $\alpha_k$ is given by Eq. (31) or (32). If the conjunctive rule
is adopted for the global combination at step 4, the final bba
we get is

$$m_{LNSa\text{-}CR} = \mathop{\bigcirc\mkern-14mu\cap}\limits_{k=1,\ldots,c} (A_k)^{1 - \alpha_k}. \qquad (40)$$

In this approximate rule for a large number of sources,
the initial mass functions are no longer considered, and the
combination of the bbas inside each group is no longer
required. This can accelerate the algorithm to a
large extent. The LNS-CR and LNSa-CR rules provide different
results when the number of sources is small. However, when
the number of sources is large enough, they can be regarded
as equivalent.
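A minimal Python sketch of the approximate rule of Eq. (40) is given below, using the proportion-based factor of Eq. (31). Only the focal element of each source is needed, since the individual weights no longer enter the result. The encoding (subset bitmasks), the exclusion of vacuous sources from the counts and the function names are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable

Mass = Dict[int, float]  # assumed encoding: subset bitmask -> mass


def conjunctive(m1: Mass, m2: Mass) -> Mass:
    """Pairwise conjunctive rule: products of masses go to the intersection."""
    out: Mass = {}
    for a, va in m1.items():
        for b, vb in m2.items():
            out[a & b] = out.get(a & b, 0.0) + va * vb
    return out


def lnsa_cr(focals: Iterable[int], n: int) -> Mass:
    """LNSa-CR sketch: each group k contributes the SSF (A_k)^(1 - alpha_k)."""
    theta = (1 << n) - 1
    counts = Counter(f for f in focals if f != theta)  # vacuous sources skipped
    total = sum(counts.values())
    result: Mass = {theta: 1.0}
    for focal, s_k in counts.items():
        alpha = s_k / total  # Eq. (31)
        result = conjunctive(result, {focal: alpha, theta: 1.0 - alpha})
    return result


# Eight sources focused on {t1} and two focused on {t2}, on a two-element frame.
approx = lnsa_cr([0b01] * 8 + [0b10] * 2, 2)
```

Here the result is 0.64 on {t1}, 0.04 on {t2}, 0.16 on the empty set and 0.16 on the frame. With only ten sources this differs slightly from what the exact rule gives for specific weights, and the gap shrinks as the groups grow.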


C. Properties 


The proposed rule is commutative, but not associative. The
rule is not idempotent, but there is no absorbing element. The
vacuous mass function is a neutral element of the LNS-CR
rule.


There are four steps when applying the LNS-CR rule†:
decomposition (not necessary for simple support mass functions),
inner-group combination, discounting and global combination.
The LNS-CR rule has the same memory complexity as some
other rules, such as the conjunctive, Dempster and cautious
rules, if all the rules are applied globally using the Fast
Möbius Transform (FMT) method. Only the DP and PCR6 rules have
a higher memory complexity because of the partial conflicts to
manage. Suppose the number of mass functions to combine is $S$,
and the number of elements in the frame of discernment is $n$.
The complexity of decomposing‡ the mass functions into SSFs is
$O(Sn2^n)$. For combining the mass functions in each group, due
to the structure of simple support mass functions, we only need
to calculate the product of the masses on a single focal
element, $\Theta$. Thus the complexity is $O(S)$. The complexity
of the discounting is $O(2^n)$. In the process of global
combination, the bbas are all SSFs. If we use the Fast Möbius
Transform method, the complexity is $O(n2^n)$. There are at most
$2^n$ mass functions participating in the discounting and global
conjunctive combination processes. Since in most applications
with a large number of mass functions we have $2^n < S$,
the last two steps are not very time-consuming. The total
complexity of LNS-CR is $O(Sn2^n + S + 2^n + n2^n)$, which
is approximately equivalent to $O(Sn2^n)$.
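The linear cost of the inner-group step can be illustrated with a minimal Python sketch. It assumes each simple support function is stored as a pair (focal element, mass on that focal element), with the remaining mass on $\Theta$; the representation and the function name `inner_group_combine` are hypothetical and are not taken from the ibelief package.

```python
from collections import defaultdict

def inner_group_combine(ssfs):
    """Conjunctively combine SSFs that share the same focal element.

    Each SSF is a pair (focal, mass): `focal` is a frozenset (a proper
    subset of the frame) and `mass` is m(focal); the rest, 1 - mass, sits
    on the whole frame Theta. Returns {focal: combined mass on focal}.
    """
    theta_product = defaultdict(lambda: 1.0)
    for focal, mass in ssfs:            # single pass over the S sources
        theta_product[focal] *= 1.0 - mass
    # Combined SSF of a group: m(Theta) is the product, m(focal) the rest.
    return {focal: 1.0 - w for focal, w in theta_product.items()}

# Three toy sources: two support {t1}, one supports {t2}.
sources = [(frozenset({"t1"}), 0.12), (frozenset({"t1"}), 0.16),
           (frozenset({"t2"}), 0.95)]
print(inner_group_combine(sources))
```

Only one multiplication per source is needed, which is where the $O(S)$ term comes from; the discounting and global combination then operate on at most one SSF per distinct focal element.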


For the approximate method, we can also save the time for 
inner combination and the discounting. The fused mass in each 
group is calculated by the proportions, and the complexity is 
also O(S). Although the approximate method does not reduce 
the complexity, in the experimental part, we will show that it 
will save some running time in applications when S is quite 
large. 

We remark here that one of the assumptions of the LNS-CR
rule is that the majority of sources are reliable. However, this
condition is not satisfied in every application context.
Consider an example with two sensor technologies, TA
and TB. The system has two TA-sensors ($S_1$ and $S_2$) and
one TB-sensor ($S_3$). Suppose also that a parasitic signal
causes the TA-sensors to malfunction. In this situation, the
majority of sensors are unreliable, and we would not get a good
result if the LNS-CR rule were applied directly as
LNS-CR($S_1$, $S_2$, $S_3$). However, there is an underlying
hierarchy in the sources of information, and the LNS-CR rule
could be invoked according to this hierarchy, for instance as
LNS-CR(LNS-CR($S_1$, $S_2$), $S_3$). We will study this further
in future work.


†The source code for the LNS-CR rule can be found in the R package ibelief [53].
‡In the decomposing process, the Fast Möbius Transform method is used.


IV. EXPERIMENTS 


In this section, several experiments are conducted
to illustrate the behavior of the proposed combination rule
LNS-CR and to compare it with other classical rules. Several
types of randomly generated mass functions are used. The
function RandomMass in the R package ibelief [53] is
adopted to generate random mass functions [54].
Experiment 1 (Elicitation of the majority opinion). In some
applications, the elicitation of the majority opinion is very
important. In this experiment, it is assumed that reliable
sources can provide some imprecise and uncertain information,
which is assumed to be in the form of the mass functions
$m_j$ ($j = 1, 2, \cdots, 6$) over the same discernment frame
$\Theta = \{\theta_1, \theta_2, \theta_3\}$:

$m_1: m_1(\{\theta_1\}) = 0.12, \; m_1(\Theta) = 0.88,$
$m_2: m_2(\{\theta_1\}) = 0.16, \; m_2(\Theta) = 0.84,$
$m_3: m_3(\{\theta_1\}) = 0.15, \; m_3(\Theta) = 0.85,$
$m_4: m_4(\{\theta_1\}) = 0.11, \; m_4(\Theta) = 0.89,$
$m_5: m_5(\{\theta_1\}) = 0.14, \; m_5(\Theta) = 0.86,$
$m_6: m_6(\{\theta_2\}) = 0.95, \; m_6(\Theta) = 0.05.$

As can be seen, the first five sources share similar beliefs
(supporting $\{\theta_1\}$) whereas the sixth one delivers a mass
function strongly committed to another solution (supporting
$\{\theta_2\}$). These six mass functions cannot be regarded as
conflicting, because the majority of the evidence shows a
preference for $\{\theta_1\}$. Here, source 6 is assumed to be
unreliable since it contradicts all the other sources.

The combination results by the conjunctive rule, Dempster
rule, disjunctive rule, DP rule, PCR6 rule, cautious rule,
average rule and the proposed LNS-CR rule are given
in Table I. As can be observed, the conjunctive rule assigns
most of the belief to the empty set, regarding the sources
as highly conflicting. The Dempster, DP, PCR6 and average
rules redistribute all the global conflict to other
focal elements. The disjunctive rule gives the total ignorance
mass function. The cautious rule and the proposed LNS-CR
rule keep some of the conflict and redistribute the rest.
However, the belief given to $\{\theta_2\}$ is greater than that
given to $\{\theta_1\}$ when using the Dempster, DP, PCR6,
cautious and average rules, which indicates that these rules
are not robust to unreliable evidence. The fused bba obtained
by the proposed rule assigns the largest mass among the
singletons to the focal element $\{\theta_1\}$, which is
consistent with intuition. It keeps a certain level of global
conflict, and at the same time reflects the superiority of
$\{\theta_1\}$ over $\{\theta_2\}$. From the results we can see
that only the LNS-CR rule can correctly elicit the major opinion.
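The conjunctive, Dempster and average figures for the six mass functions of this experiment can be checked with a short Python sketch. The dictionary-of-frozensets representation and the helper names (`ssf`, `conjunctive`, `dempster`, `average`) are assumptions made for illustration; they are not the ibelief API, and the LNS-CR rule itself is not reimplemented here.

```python
from itertools import product

THETA = frozenset({1, 2, 3})   # frame {theta_1, theta_2, theta_3}
EMPTY = frozenset()

def ssf(focal, mass):
    """Simple support function: `mass` on `focal`, the rest on THETA."""
    return {frozenset(focal): mass, THETA: 1.0 - mass}

def conjunctive(m1, m2):
    """Unnormalized conjunctive rule: products go to set intersections."""
    out = {}
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        out[a & b] = out.get(a & b, 0.0) + va * vb
    return out

def dempster(m):
    """Normalize a conjunctive result by 1 - m(empty set)."""
    k = m.get(EMPTY, 0.0)
    return {a: v / (1.0 - k) for a, v in m.items() if a != EMPTY}

def average(masses):
    """Arithmetic mean of the mass functions."""
    out = {}
    for m in masses:
        for a, v in m.items():
            out[a] = out.get(a, 0.0) + v / len(masses)
    return out

# The six mass functions m_1, ..., m_6 of Experiment 1.
masses = [ssf({1}, 0.12), ssf({1}, 0.16), ssf({1}, 0.15),
          ssf({1}, 0.11), ssf({1}, 0.14), ssf({2}, 0.95)]

conj = masses[0]
for m in masses[1:]:
    conj = conjunctive(conj, m)
demp = dempster(conj)
avg = average(masses)

for name, m in [("conjunctive", conj), ("Dempster", demp), ("average", avg)]:
    print(name, {tuple(sorted(a)): round(v, 5) for a, v in m.items()})
```

The printed values can be compared with the Conjunctive, Dempster and Average columns of Table I; in all three cases the single dissenting source $m_6$ dominates the five concordant ones.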

The LNS-CR rule is a conjunctive-based combination rule
for mass functions with different reliability degrees. As
mentioned before, the principle of the LNS-CR rule is similar
to that of Schubert’s method [32]. Table II lists the results of
Schubert’s combination method with different values of $k$. As
can be seen, the result obtained with the LNS-CR rule is
similar to that of Schubert’s method with a small value of

§As the focal elements are singletons except $\Theta$, parameter $\eta$ has no effect
on the final results when using the LNS-CR rule.


TABLE I
THE COMBINATION OF SIX MASSES. FOR THE NAMES OF COLUMNS, ij IS USED TO DENOTE $\{\theta_i, \theta_j\}$.

                          Conjunctive  Dempster  Disjunctive  DP       PCR6     Cautious  Average  LNS-CR
$\emptyset$               0.49313      0.00000   0.00000      0.00000  0.00000  0.15200   0.00000  0.06849
$\{\theta_1\}$            0.02595      0.05120   0.00000      0.02595  0.04783  0.00800   0.11333  0.36408
$\{\theta_2\}$            0.45687      0.90136   0.00000      0.45687  0.56639  0.79800   0.15833  0.08984
$\{\theta_1, \theta_2\}$  0.00000      0.00000   0.00004      0.49313  0.00000  0.00000   0.00000  0.00000
$\{\theta_3\}$            0.00000      0.00000   0.00000      0.00000  0.00000  0.00000   0.00000  0.00000
$\{\theta_1, \theta_3\}$  0.00000      0.00000   0.00000      0.00000  0.00000  0.00000   0.00000  0.00000
$\{\theta_2, \theta_3\}$  0.00000      0.00000   0.00000      0.00000  0.00000  0.00000   0.00000  0.00000
$\Theta$                  0.02405      0.04744   0.99996      0.02405  0.38578  0.04200   0.72833  0.47759





threshold $k$. When $k$ is small, the discounting process in
Schubert’s method needs more steps, and in each step the
conjunctive rule has to be invoked to calculate the falsity. In
that sense it is more complex than the reliability estimation
process of the LNS-CR rule.


TABLE II
THE COMBINATION OF SIX MASSES BY SCHUBERT’S METHOD WITH
DIFFERENT VALUES OF $k$.

$k$                       0.1      0.2      0.3      0.4      0.5
$\emptyset$               0.09776  0.19471  0.28680  0.37803  0.46444
$\{\theta_1\}$            0.32187  0.26219  0.19350  0.12081  0.04980
$\{\theta_2\}$            0.13521  0.23145  0.31033  0.37979  0.43871
$\{\theta_1, \theta_2\}$  0.00000  0.00000  0.00000  0.00000  0.00000
$\{\theta_3\}$            0.00000  0.00000  0.00000  0.00000  0.00000
$\{\theta_1, \theta_3\}$  0.00000  0.00000  0.00000  0.00000  0.00000
$\{\theta_2, \theta_3\}$  0.00000  0.00000  0.00000  0.00000  0.00000
$\Theta$                  0.44516  0.31165  0.20937  0.12137  0.04704





We also compare with another reliability-discounting-based
combination method proposed by Martin et al. [10]. As in
Schubert’s method, after the reliability degree of each
source is estimated, the bbas are discounted and then combined
with the conjunctive rule. There is a parameter $\lambda$ in the
method to adjust the discounting factor. The results for
different values of $\lambda$ are shown in Table III. We can see
that this rule behaves similarly to the LNS-CR rule when
$\lambda$ is set to around 1. When $\lambda$ is not well set, the
results are not good. Moreover, in this method the distance
between bbas has to be calculated first. This increases the
complexity and makes the method infeasible for combining a
large number of sources.


TABLE III
THE COMBINATION OF SIX MASSES BY MARTIN’S METHOD WITH
DIFFERENT VALUES OF $\lambda$.

$\lambda$                 0.1      0.5      1        1.5      2
$\emptyset$               0.00000  0.00350  0.10485  0.23330  0.31956
$\{\theta_1\}$            0.00000  0.21206  0.34700  0.26789  0.19410
$\{\theta_2\}$            0.00000  0.01272  0.12719  0.23219  0.30256
$\{\theta_1, \theta_2\}$  0.00000  0.00000  0.00000  0.00000  0.00000
$\{\theta_3\}$            0.00000  0.00000  0.00000  0.00000  0.00000
$\{\theta_1, \theta_3\}$  0.00000  0.00000  0.00000  0.00000  0.00000
$\{\theta_2, \theta_3\}$  0.00000  0.00000  0.00000  0.00000  0.00000
$\Theta$                  1.00000  0.77172  0.42096  0.26661  0.18378





Experiment 2 (The discounting mechanism). In this experiment,
we discuss the reliability discounting mechanism
of the LNS-CR rule. Two reliability discounting methods,
proposed by Schubert [32] and Martin et al. [10], are used
for comparison. As in the LNS-CR rule, after the discounting
process of these two methods, the conjunctive rule is adopted
to combine the new mass functions. For simplicity, we
call the combination rule in which Schubert’s discounting
method (or Martin’s discounting method) is first applied and
then the conjunctive combination rule is used “Schubert’s
method” (“Martin’s method”, correspondingly). A set of $3x$
bbas on a frame of discernment $\Theta = \{\theta_1, \theta_2\}$
is generated; $x$ of them are unreliable while $2x$ are reliable.
The reliable sources assign a large mass to the singleton
$\{\theta_1\}$. The unreliable sources assign a large mass to the
singleton $\{\theta_2\}$. The gain factor for sequential
discounting in Schubert’s method is set to 0.1 here. Schubert’s
and Martin’s methods are invoked with different values of $k$
and $\lambda$ respectively. With $x = 10$, the fused bbas
obtained by the different rules are listed in Table IV.

From the table we can see that the behavior of Martin’s
discounting method is similar to that of the LNS-CR rule when
$\lambda$ is set to around 0.4. The conjunctive combination based
on Schubert’s discounting does not give any belief to
$\{\theta_2\}$ or to $\Theta = \{\theta_1, \theta_2\}$ at all,
although 1/3 of the sources support $\{\theta_2\}$. Moreover,
when $k$ is larger, most of the mass is assigned to the empty set
by this rule. From these results we can see that only the LNS-CR
rule can give more belief to $\{\theta_1\}$, which can be
regarded as the major opinion. The time elapsed for Schubert’s
method with different values of the threshold $k$ is listed in
Table V. The smaller the value of $k$, the more discounting steps
are required in Schubert’s method, and consequently the longer
the running time. The running time of both the LNS-CR rule and
Martin’s method is less than one second. Schubert’s method is
much more time-consuming.


We have also tested the combination methods based on the
discounting factors proposed by Schubert [32] and Martin et al.
[10] on some simple support mass functions with arbitrary
focal elements. The results are not shown here as they lead to
similar conclusions: the reliability estimation
process of these methods takes more time than
that of the LNS-CR rule. The behavior of these two methods is
similar to that of the LNS-CR rule when the parameter $k$ or
$\lambda$ is set within a fixed range, but they are much more
time-consuming than the LNS-CR rule. This confirms that
the reliability discounting method in the LNS-CR rule is
effective for the subsequent conjunctive combination.


Experiment 3 (The influence of parameter $\eta$). We test here the


TABLE IV
THE COMBINATION RESULTS BY DIFFERENT RULES.

                 Schubert’s method                     | Martin’s method                                              | LNS-CR
                 $k=0.2$  $k=0.3$  $k=0.5$  $k=0.7$    | $\lambda=0.3$  $\lambda=0.4$  $\lambda=0.6$  $\lambda=1$     |
$\emptyset$      0.19949  0.29860  0.49704  0.69306    | 0.00248        0.10019        0.60681        0.98649         | 0.15060
$\{\theta_1\}$   0.80051  0.70140  0.50296  0.30694    | 0.16901        0.56713        0.38729        0.01351         | 0.48612
$\{\theta_2\}$   0.00000  0.00000  0.00000  0.00000    | 0.01200        0.04995        0.00360        0.00000         | 0.08593
$\Theta$         0.00000  0.00000  0.00000  0.00000    | 0.81650        0.28274        0.00230        0.00000         | 0.27735

















TABLE V
TIME ELAPSED FOR SCHUBERT’S METHOD WITH DIFFERENT VALUES OF $k$.

$k$               0.10   0.20   0.30   0.40  0.50  0.60  0.70  0.80  0.90
Time elapsed (s)  46.81  21.64  13.46  9.28  6.64  4.88  3.67  2.73  1.79








influence of parameter $\eta$ in the LNS-CR rule. Simple support
mass functions are utilized in this experiment. Suppose that the
discernment frame under consideration is
$\Theta = \{\theta_1, \theta_2, \theta_3\}$.
Three types of SSFs are adopted. First $s_1 = 60$ and $s_2 = 50$
SSFs with focal elements $\{\theta_1\}$ and $\{\theta_2\}$
respectively (the other focal element is $\Theta$) are uniformly
generated, and then $s_3 = 50$ SSFs with focal element
$\theta_{23} = \{\theta_2, \theta_3\}$ are generated. The
values of the masses are randomly generated. Different values of
$\eta$ (see Eq. (32)) ranging from 0 to 6 are tested. The
mass values in the bba fused by LNS-CR as a function of $\eta$
are displayed in Figure 1.a, and the corresponding pignistic
probabilities are shown in Figure 1.b.
From these figures, we can see that $\eta$ can have some
effect on the final decision. Figure 1.a shows that as $\eta$
increases, the mass assigned to the singleton focal
elements increases. On the contrary, the mass given to the focal
element whose cardinality is greater than one decreases. In
fact, parameter $\eta$ in LNS-CR aims at weakening the imprecise
evidence which gives positive mass only to focal elements
with high cardinality, and the exponent $\eta$ controls
the degree of discounting. If $\eta$ is larger, more weight is
given to the sources of evidence whose focal elements are more
specific, and more discounting is applied to the imprecise
evidence. As a result, in the experiment, when $\eta$ is larger
than 1.2, BetP($\theta_1$) > BetP($\theta_2$) (Figure 1.b). In
this case the mass functions with focal element
$\{\theta_2, \theta_3\}$ make little contribution
to the fusion process, while the final decision mainly depends
on the other two types of simple support mass functions with
singletons as focal elements.

In real applications, $\eta$ could be determined based on
specific requirements. This work does not specifically focus on
how to determine $\eta$; thus in the following experiments we
set $\eta = 1$ by default.

Fig. 1. Combination results for three types of SSFs using the LNS-CR rule.
The mass functions are generated randomly, and the LNS-CR rule is invoked with
different values of $\eta$ ranging from 0 to 6. Panel b: pignistic probability.

Experiment 4 (The principle for the global conflict). The
goal of this experiment is to show how Dempster’s degree
of conflict is dealt with by most rules when combining a
large number of conflicting sources.

In this experiment, the frame of discernment is set to
$\Theta = \{\theta_1, \theta_2\}$. Assume that there are only two
focal elements in each bba. One is the whole frame $\Theta$, and
the other is one of the singletons ($\{\theta_1\}$ or
$\{\theta_2\}$). The number of bbas which have the focal element
$\{\theta_1\}$ is denoted by $s_1$, while that with
$\{\theta_2\}$ is $s_2$. We first fix the value of $s_2$, and let
$s_1 = t \cdot s_2$, with $t$ a positive integer. We generate
$S = s_1 + s_2$ such bbas randomly, retaining only the bbas for
which the mass assigned to $\{\theta_1\}$ or $\{\theta_2\}$ is
greater than 0.5.

Four values of $t$ are considered here: $t = 1, 2, 3, 4$. If
$t = 1$, $s_1 = s_2 = S/2$. If $t = 2$, the number of mass
functions supporting $\{\theta_1\}$ is twice that supporting
$\{\theta_2\}$, and so on. The global conflict (mass given to the
empty set) after the combination with different values of $s_2$
for the four cases is displayed in Figures 2 to 5 respectively.
The mass assigned to the focal element $\{\theta_1\}$ by the
different combination approaches is shown in Figures 6 to 9.
Fig. 2. The global conflict after the combination with $s_2$ ranging over [0, 100]
and $s_1 = s_2$.
Fig. 3. The global conflict after the combination with $s_2$ ranging over [0, 100]
and $s_1 = 2 s_2$.
Fig. 4. The global conflict after the combination with $s_2$ ranging over [0, 100]
and $s_1 = 3 s_2$.
Fig. 5. The global conflict after the combination with $s_2$ ranging over [0, 100]
and $s_1 = 4 s_2$.
Fig. 6. The mass on $\{\theta_1\}$ after the combination with $s_2$ ranging over [0, 100]
and $s_1 = s_2$.
Fig. 7. The mass on $\{\theta_1\}$ after the combination with $s_2$ ranging over [0, 100]
and $s_1 = 2 s_2$.
Fig. 8. The mass on $\{\theta_1\}$ after the combination with $s_2$ ranging over [0, 100]
and $s_1 = 3 s_2$.


It is intuitive that when $t$ becomes larger, the global conflict
should be smaller and we should give more belief to the focal
element $\{\theta_1\}$. From Figures 2 to 9 we can see that only
the results of the LNS-CR rule are in accordance with this
common sense. The simple average rule assigns a larger mass to
$\{\theta_1\}$, but it does not keep any conflict. In Figures 6
to 9, the mass given to $\{\theta_1\}$ by the Dempster rule
cannot be displayed when $S$ is large (and also for some small
$S$), because in these cases the global conflict is 1 and the
normalization cannot be carried out. As we can see, the Dempster
rule does not work at all when $s_2$ is larger than 20. Although
the conjunctive rule and the cautious rule still work when
combining a larger number of mass functions, the fused mass
function obtained is $m(\emptyset) \approx 1$, which is useless
for decision-making in practical situations.
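The saturation of the conflict can be reproduced with a small Python sketch for the two-element frame of this experiment. It uses a fixed toy mass of 0.75 on each singleton (an assumption made here so that the output is deterministic; the experiment itself draws the masses at random), and the function name `conjunctive_two_groups` is hypothetical.

```python
def conjunctive_two_groups(w1, w2):
    """Conjunctive fusion of SSFs on a two-element frame {t1, t2}.

    w1 / w2 are the masses on Theta of the SSFs focused on {t1} / {t2}.
    Returns (m_empty, m_t1, m_t2, m_theta).
    """
    a = 1.0
    for w in w1:
        a *= w          # Theta-mass of the combined {t1} group
    b = 1.0
    for w in w2:
        b *= w          # Theta-mass of the combined {t2} group
    return (1 - a) * (1 - b), (1 - a) * b, a * (1 - b), a * b

for s2 in (1, 5, 20, 100):
    # Assumed toy setting: every source puts 0.75 on its singleton (t = 1).
    m_empty, m_t1, m_t2, m_theta = conjunctive_two_groups([0.25] * s2,
                                                          [0.25] * s2)
    # 1 - m_empty is the denominator of Dempster's normalization.
    print(s2, m_empty, 1.0 - m_empty)
```

In double precision the conflict becomes exactly 1.0 once enough sources are combined, so the denominator $1 - m(\emptyset)$ of the Dempster rule evaluates to zero and the normalization is no longer possible.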
Fig. 9. The mass on $\{\theta_1\}$ after the combination with $s_2$ ranging over [0, 100]
and $s_1 = 4 s_2$.


The results also confirm the equivalence of the LNS-CR
rule and the LNSa-CR rule when the number of sources is large,
although the results provided by the two rules are not the same
when there are not many mass functions to combine. From
Figures 2 to 5 we can see a kind of limit of the global conflict
for the LNS-CR rule. In fact, the mass on the empty set for
this rule depends on the size of the frame of discernment and
more directly on the number of groups created in the first step
of the rule. The limit value of the global conflict tends to 1
as the size of the frame of discernment increases when
considering only categorical bbas on different singletons.
Experiment 5 (The complexity). In this experiment, the
complexity of the LNS-CR rule is compared with that of other
combination rules in terms of running time. Simple
support mass functions defined on a frame of discernment with
eight elements are considered first. The focal elements of each
bba are set to be a random subset of $\Theta$ and $\Theta$
itself. The time elapsed (and also the log value of the time
elapsed) with the number of sources $S$ varying from 10,000 to
100,000 is shown in Figure 10¶. We can see that the running time
of LNS-CR is much smaller than that of the conjunctive rule. The
LNSa-CR rule takes almost the same time as the cautious rule. The
average rule is the fastest among the five rules. As $S$
increases, the LNSa-CR rule saves more time compared with the
LNS-CR rule. The increase in running time with
respect to $S$ is moderate. This tends to show that the LNS-CR
rule is suitable for combining a large number of SSFs. Note that
the decomposition process is not required when the cautious
rule or the LNS-CR(a) rule is adopted for combining SSFs.


As mentioned before, for the combination of general separable
mass functions (not SSFs), LNS-CR needs four steps:
decomposition, inner-group combination, discounting and global
combination. The difference between the combination of any
kind of separable bbas and that of SSFs is the decomposition
process, which is not necessary for the latter. We have designed
another experiment on consonant bbas‖ over a frame of


¶The result of the Dempster rule is the same as that of the conjunctive rule.
‖All consonant bbas are separable.
Fig. 10. Time lapse for combining SSFs.


discernment with eight elements, and the number of focal 
elements is set to 5. The focal elements are randomly set to 
five nested subsets of ©, and the mass values are generated 
uniformly. The average running time (and the log value of the 
running time) of 10 trials by the use of different combination 
rules with different number of sources S is displayed in Figure 
11.a (and Figure 11.b)**. In order to show the complexity of 
LNS-CR rule more clearly, the elapsed time in each of the 
four steps is shown in Figure 12. 


As we can see from these figures, the running time
of LNS-CR is significantly smaller than that of the cautious
rule, but a little larger than that of the conjunctive rule and
the average rule. Although the complexity of the cautious rule is
the same as that of the LNS-CR rule and both of them require a
decomposition process, it takes more running time than the
LNS-CR rule. The reason may be the different combination
approach for the mass functions in the same group. The
complexity of that process for the cautious rule is $O(S2^n)$
(the calculation is to find


**The result of cautious rule is not displayed for large S, as it has been 
already shown that cautious rule is significantly worse than the other rules in 
terms of time consumption when S is small. 
Fig. 12. Time lapse of each step using the LNS combination rule with $S$ varying
from 10,000 to 100,000.


the minimum of each row in an $S \times 2^n$ matrix), while for
LNS-CR it is $O(S)$. LNSa-CR is faster than LNS-CR when $S$
is large. Figure 12 shows that the most time-consuming step in
the LNS-CR rule is the decomposition. Moreover, as $S$ increases,
the increase in time for the inner-group combination,
discounting and global combination is limited. This is
consistent with the complexity analysis of each step of the
LNS-CR rule in Section III-C. In many applications the mass
functions are directly SSFs, in which case there is no need to
perform the decomposition, and LNS-CR is the best choice to fuse
a large number of bbas.


V. PERSPECTIVE ON APPLICATIONS 


Pattern recognition is a class of problems where the theory
of belief functions has been shown to improve performance
[2]. In such problems we may face many bbas to combine.
Denœux [2] proposed the Evidential KNN method (EKNN) as an
extension of KNN in the framework of the theory of belief
functions to better model the uncertainty in neighbor point
interactions. The Dempster rule is adopted to combine the
items of evidence from the $K$ neighbors in EKNN.

The problem considered here is to classify an input pattern $x$
into $n$ categories or classes, denoted by
$\Theta = \{\theta_1, \theta_2, \cdots, \theta_n\}$.
The available information is assumed to consist of a training
set $L = \{(x^{(1)}, \theta^{(1)}), (x^{(2)}, \theta^{(2)}), \ldots, (x^{(N)}, \theta^{(N)})\}$
of $N$ patterns $x^{(i)}$, $i = 1, 2, \cdots, N$, with known
class labels $\theta^{(i)} \in \Theta$. To classify pattern $x$,
each pair $(x^{(i)}, \theta^{(i)})$ constitutes a
distinct item of evidence regarding the class membership of $x$.
If the $K$ nearest neighbors according to the distance measure
are considered, $K$ items of evidence can be obtained. These
bbas can be constructed according to a relevant metric between
pattern $x$ and its $j$-th neighbor $x^{(j)}$:

$$m_j(\{\theta_q\}) = \alpha \phi(d^{(j)}),$$
$$m_j(\Theta) = 1 - \alpha \phi(d^{(j)}),$$
$$m_j(A) = 0, \quad \forall A \in 2^{\Theta} \setminus \{\{\theta_q\}, \Theta\},$$    (41)

where $d^{(j)}$ is the (Euclidean) distance between $x$ and its
$j$-th neighbor $x^{(j)}$ with class label
$\theta^{(j)} = \theta_q$, $\alpha$ is a discounting
parameter and $\phi(\cdot)$ is a decreasing function on
$\mathbb{R}^+$ defined as

$$\phi(d^{(j)}) = \exp\left(-\gamma_q \left(d^{(j)}\right)^2\right),$$    (42)

with $\gamma_q$ being a positive parameter associated with class
$\theta_q$. It can be heuristically set to the inverse of the
mean Euclidean distance between training data belonging to
class $\theta_q$. In EKNN, the $K$ bbas, one for each neighbor,
are aggregated using the Dempster rule to form a resulting bba.
A decision has to be made regarding the assignment of sample $x$
to one individual class. The maximum of pignistic probability
can be used for decision-making.
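The construction of one neighbor's mass function can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function name `eknn_bba` is hypothetical, the squared distance in the exponent follows the usual EKNN formulation, and the distances in the usage lines are made-up values.

```python
import math

def eknn_bba(distance, gamma_q, alpha=0.95):
    """Mass function induced by one neighbour of class q (Eq. (41)).

    Returns (mass on {theta_q}, mass on Theta). The squared distance in
    the exponent follows the usual EKNN choice and is an assumption here.
    """
    support = alpha * math.exp(-gamma_q * distance ** 2)
    return support, 1.0 - support

# Hypothetical distances to three neighbours of the same class.
for d in (0.0, 0.5, 2.0):
    print(d, eknn_bba(d, gamma_q=1.0))
```

Each neighbor therefore contributes a simple support function whose support for its own class decays with distance, so distant neighbors are close to the vacuous mass function.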


A. A small data set with noisy training sample 


Figure 13 illustrates a simple two-class (red circle and green
triangle) data set, where there are seven objects in each class.
The pattern $x$ marked by a blue star is the sample to be
classified. The $K$ bbas using the distances to its neighbors
can be constructed by Eq. (41), and the five nearest neighbors
are denoted by $N_j$ in order in the figure. Set $\alpha = 0.95$,
and let $\gamma_i$ be the inverse of the average distance between
the points in class $\theta_i$, $i = 1, 2$. The fused mass
functions obtained by different combination rules with $K = 4$
and $K = 5$ are listed in Tables VI and VII respectively.
Fig. 13. A small data set.
Fig. 14. Pignistic probability.


As we can see from Figure 13, pattern $x$ is closer to class
$\theta_2$. Among pattern $x$’s five nearest neighbors $N_j$,
$j = 1, 2, \cdots, 5$, four belong to class $\theta_2$ while only
one belongs to class $\theta_1$. The real class of object $N_1$
is $\theta_1$, but it is located on the boundary of the class and
far from the other data points in the class. It may be a noisy
item of $\theta_1$. The standard KNN rule can correctly classify
object $x$ to $\theta_2$ when $K > 3$. However, if the
evidential KNN model is applied, the behavior of the combination
rules is affected by the existence of such a neighbor. From
Table VI we can see that, when $K = 4$, the bbas fused by all
the combination rules assign more mass to $\theta_1$ than
to $\theta_2$. Consequently, pattern $x$ will be classified into
class $\theta_1$ if the pignistic probability is used for
decision-making. The same phenomenon also occurs when $K$ is
smaller than 4 (see Figure 14). When $K = 5$ (Table VII), only
the LNS-CR rule assigns pattern $x$ to class $\theta_2$, which
seems more reasonable. The pignistic probabilities (Figure 14)
obtained by the Dempster, conjunctive, cautious and average rules
for class $\theta_1$ are significantly higher than those for
class $\theta_2$, even when $K$ is large. These rules are not
robust to noisy training data. Pattern $x$ is correctly
classified to $\theta_2$ by the LNS-CR rule when $K$ is between
5 and 10.
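The decision step can be made concrete with a brief Python sketch that applies the pignistic transformation to the LNS-CR columns of Tables VI and VII. The function name `pignistic` and the tuple layout are assumptions made for illustration; the mass values are copied from the two tables.

```python
def pignistic(m_empty, m_t1, m_t2, m_theta):
    """Pignistic probabilities on a two-class frame {theta_1, theta_2}.

    The mass on Theta is split equally between the two classes and the
    result is renormalized by 1 - m(empty set).
    """
    norm = 1.0 - m_empty
    return ((m_t1 + m_theta / 2) / norm, (m_t2 + m_theta / 2) / norm)

# LNS-CR columns of Table VI (K = 4) and Table VII (K = 5).
lns_cr = {4: (0.0377, 0.1818, 0.1339, 0.6466),
          5: (0.0352, 0.1404, 0.1651, 0.6593)}
for k, bba in lns_cr.items():
    bet1, bet2 = pignistic(*bba)
    decision = "theta_1" if bet1 > bet2 else "theta_2"
    print(k, round(bet1, 4), round(bet2, 4), decision)
```

With these values the decision switches from $\theta_1$ at $K = 4$ to $\theta_2$ at $K = 5$, in line with the discussion above.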

It is indicated that when there are some noisy data in the 
training data set, the performance of the combination rule 
may become worse with small K. We should increase K 
moderately to improve the performance of the classifier. But 
as we analyzed before, the existing combination rules do not 
work well for aggregating a large number of mass functions. 
This is a limit of the use of evidential classifier. 


TABLE VI
THE FUSED BBA BY DIFFERENT COMBINATION RULES ($K = 4$).

                  Conjunctive  Dempster  Cautious  Average  LNS-CR
$\emptyset$       0.2009       0.0000    0.1473    0.0000   0.0377
$\{\theta_1\}$    0.6771       0.8473    0.7307    0.2195   0.1818
$\{\theta_2\}$    0.0279       0.0349    0.0205    0.0606   0.1339
$\Theta$          0.0941       0.1177    0.1015    0.7199   0.6466


TABLE VII
THE FUSED BBA BY DIFFERENT COMBINATION RULES ($K = 5$).

                  Conjunctive  Dempster  Cautious  Average  LNS-CR
$\emptyset$       0.2198       0.0000    0.1473    0.0000   0.0352
$\{\theta_1\}$    0.6582       0.8436    0.7307    0.1756   0.1404
$\{\theta_2\}$    0.0305       0.0391    0.0205    0.0541   0.1651
$\Theta$          0.0915       0.1172    0.1015    0.7703   0.6593





B. Real data sets 


In this section, we consider some well-known real data
sets from the UCI repository††, summarized in Table VIII.
The classification rates obtained using different combination
rules in the evidential KNN model are displayed in Figure 15.
Note that the “leave-one-out” method is adopted here to test the
classifier.


TABLE VIII
A SUMMARY OF UCI DATA SETS.

Data set  No. of objects  No. of clusters  No. of attributes
Iris      150             3                4
Yeast     1484            10               8
Digits    5620            10               64





As we can see from Figure 15, for all three data sets,
the performance of the two combination rules, LNS-CR and DS,
is almost the same in terms of classification rates. However,
there is a slight improvement with the LNS-CR rule when
$K$ is large. To make this clear, we show the results on the
Digits data set separately in Figure 16. When $K > 12$,
the classification rates obtained with the LNS-CR rule are a
little higher than those obtained with the DS rule. We show the
mass given to the empty set (global conflict) after the
combination using

††http://archive.ics.uci.edu/ml/datasets.html


the conjunctive rule and the LNS-CR rule with different values of
$K$ in Figure 17. The y-axis is the maximal mass assigned to
$\emptyset$ among all the mass functions for the test data. As we
can see, the global conflict of the conjunctive rule tends to 1
quickly as $K$ increases, while the LNS-CR rule keeps a moderate
degree of global conflict. As the DS rule is a normalized
conjunctive rule, it makes no sense to normalize a mass
assignment with such a high global conflict.
Fig. 15. Classification results with different values of $K$ on the UCI data sets.
In the figure, the legend “Iris-DS” denotes the classification rates on the Iris
data set using the DS combination rule. The other legends are read in the same way.
Fig. 16. Classification rates on the Digits data set.


C. Perspective 


The above two examples are just two perspectives on the
application of the LNS-CR rule. In the first example, there are
some noisy data in the training data set. In this case,
the sources should not be considered equally reliable, and
using the DS rule or the conjunctive rule
in the EKNN model does not give good results. In the second
example, it is shown that the global conflict may tend to one
quickly as $K$ increases. Sometimes we cannot even perform the
normalization step of the DS rule because of the machine
precision.

Fig. 17. Global conflict using the conjunctive rule and the LNS-CR rule
with different values of $K$. In the figure, the legend “Iris-DS” denotes
the conflict on the Iris data set using the DS combination rule. The other
legends are read in the same way.

In real-world social networks, the available information can
be uncertain, or even noisy. In this case, if we want to perform
a classification task, for instance for recommendation, the
conjunctive rule cannot be applied as the sources are not all
reliable. Even if the sources are reliable, the global conflict
may tend to 1 quickly if the bbas are not consistent. The LNS-CR
rule can then be an alternative choice. In future work, we will
study how Dempster’s degree of conflict is distributed in the
feature space, and what special information is contained
in the moderate degree of global conflict kept by the LNS-CR
rule.


VI. CONCLUSION 


Uncertainty in big data applications has attracted more
and more attention. The theory of belief functions is one
of the uncertainty theories allowing a model to deal with
imprecise and uncertain information. This theory is also well
designed for information fusion. However, although many
combination rules have been proposed in recent years in this
framework, they are not able to combine a large number of
sources because of their complexity or the existence of an
absorbing element.

In this paper, a new combination rule, named the LNS-CR rule,
which preserves the principle of the conjunctive rule, is
proposed. This rule considers the mass functions given by the
sources and groups them according to their set of focal elements
(without auto-conflict). The mass functions of each group can
be summarized by one mass function after combination. The
reliability of the sources is estimated by the proportion of
bbas in each group. Therefore, after discounting the mass
function of each group by the reliability factor, the final
combination can be performed with the conjunctive rule (or
another rule according to the application). If the number of
sources in each group is high enough, an approximation method
can be used.

The LNS-CR rule is able to combine a large number of 
sources. The only existing method allowing to combine a large 
number of mass functions is the average rule. However, that 
rule may give more importance to few sources with a high 
belief (even if the source is not reliable) and cannot capture 
the conflict between the sources. The proposed rule with a 
reasonable complexity (lower than the DP and PCR6 rules) 
can provide good combination results. 

Overall, this work provides a perspective for the applica- 
tion of belief functions on big data. We will study how to 
apply LNS-CR rule on the problems of social network and 
crowdsourcing in the future research work. 